CHAPTER 1
Introduction 

Algorithms  for  various  computations  have  been 
known  and  studied  for  centuries,  but  it  is  only  recently 
that  much  theoretical  attention  has  been  devoted  to  the 
structure  of  computation.   Turing  machines,  recursive 
functions,  and  related  approaches  were  the  first  to  appear. 
These models, however, while providing much interesting
mathematics, do not look at the problem from a practical
standpoint. In "real" computing, no one uses Turing
machines to evaluate polynomials or to multiply matrices,
and very little practical significance can be found in
results using that approach. On the other hand, recent
work by Winograd, Pan, Belaga, and others has led to very
realistic models and correspondingly practical results.
"Practical result" is used here to mean that someone using
a real programming language can apply the result to his
work. One well-known example is Horner's scheme for
polynomial evaluation, which has been proved optimal in a
certain strong sense; a FORTRAN programmer can use this
result in that he will know that no better method is extant.

This  thesis  is  concerned  with  establishing  lower 
bounds  on  the  number  of  comparisons  required  to  solve 
various  combinatorial  problems;  in  particular  the  problems 
of  testing  set  equality,  computing  the  maximum  of  a  set, 
and  computing  the  median  of  a  set  are  discussed. 


Optimality questions for problems of this type appear to
have been studied for at least forty years in regard to
tournament systems: What is the smallest number of matches
required to determine the best among n contestants? To
completely rank the n contestants?
These  problems  translate  directly  into  problems  of  finding 
the  maximum  of  a  set  and  sorting  a  set,  respectively. 

In proving certain combinatorial algorithms minimal,
the most powerful technique available has been the
so-called "information theoretic" argument. The heart of
such arguments is that if we must choose one out of n items,
and each comparison gives at most one bit of information,
then at least $\log_2 n$ comparisons are necessary to choose
the distinguished item; Steinhaus [63] appears to have been
the first to use this method.

The  weakness  of  this  approach  is  that  it  will 
only  handle  the  case  when  the  comparisons  are  allowed  be- 
tween actual  inputs,  rather  than  functions  of  the  inputs. 
Such  restrictions  are  reasonable  when  dealing  with  tourna- 
ments:  it  makes  no  sense  to  "compare",  say,  the  squares 
of  the  differences  between  contestants,  since  in  this  case 
a  comparison  is  defined  as  being  a  match  between  two  con- 
testants.  However,  when  the  problems  are  considered, 
instead,  to  be  those  of  searching  for  a  distinguished 
number  in  a  set  of  numbers,  it  is  not  unreasonable  to  want 
to  be  able  to  compare  various  functions  of  the  inputs. 


In  this  thesis  we  introduce  a  new  model  of  algo- 
rithms, tree  programs,  which  facilitate  combining  informa- 
tion theoretic  arguments  with  results  from  linear  algebra 
in  order  to  obtain  results  in  the  case  that  linear  functions 
of  the  inputs  are  allowed  in  comparisons.   Using  these  tree 
programs  it  would  also  be  possible  to  obtain  results  when 
comparisons  of  arbitrary  classes  of  functions  of  the  inputs 
are  permitted,  provided  enough  was  known  about  the  class 
of  functions  allowed;  however,  not  enough  seems  to  be  known 
about  classes  of  functions  beyond  the  linear  functions  and 
this  thesis  concentrates  on  that  class. 

In this area, among the new results obtained are
that determining whether two finite sets of integers, A
and B, are equal cannot be done more quickly than in
$c \cdot n \log n$ comparisons, $n = \max(|A|, |B|)$, even if
comparisons can be made between linear functions of the
inputs, and that the maximum of a set of n integers cannot be
computed more quickly than in n-1 comparisons, even if
comparisons can be made between linear functions of the inputs.
It is also proved that if comparisons are allowed between
exponential functions of the inputs then the maximum can
be computed in about $\log_2 n$ comparisons, and similarly if
polynomial functions are allowed the median can be computed
in fewer than 2n comparisons. Tree programs are also
applied to problems in graph theory; by some elementary
counting arguments it is shown that determining various
connectivity properties of graphs with n nodes requires
about $n^2$ operations.

In  the  remainder  of  this  chapter  the  notation  is 
explained  and  then  a  history  of  previous  results  in  this 
area  is  given.   It  should  be  noted  that  this  history  is 
complete  only  in  so  far  as  the  literature  was  available, 
and  undoubtedly,  some  results  remain  buried  in  obscure 
foreign  journals,  unavailable  technical  reports,  and  un- 
published manuscripts.   In  the  second  chapter  tree  programs 
are  defined,  and  by  their  application  new  results  are  given. 
In  the  third  and  final  chapter  some  conclusions  are  stated 
and  open  problems  and  directions  for  future  research  are 
suggested. 

1.1 Notation

The integers will be denoted by
$N = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ and R will denote the
rationals. The floor and ceiling operations are defined
as in [21]: $\lfloor x \rfloor$ is the largest integer less than
or equal to x, and $\lceil x \rceil$ is the smallest integer
greater than or equal to x. Throughout the paper the terms set
and vector will be used to mean finite set (vector) of integers.
Whenever a set A is an input to an algorithm, it is assumed
that |A|, the cardinality of A, is also an input.

We use the notation introduced by Landau [31]
for the order of magnitude of a function. We say that
f(n) = O(g(n)) if there is a constant k > 0 such that

$$\limsup_{n \to \infty} \frac{|f(n)|}{g(n)} = k .$$

If

$$\lim_{n \to \infty} \frac{|f(n)|}{g(n)} = 0 ,$$

then we say that f(n) < O(g(n)) (instead of Landau's
f(n) = o(g(n))); on the other hand, if

$$\lim_{n \to \infty} \frac{|f(n)|}{g(n)} = \infty ,$$

we say that f(n) > O(g(n)).


1.2 History of Previous Results

1.2.1   Polynomial  Evaluation 

One of the simplest problems in the area of
optimal algorithms is to determine the minimal number of
multiplications needed to compute $x^n$ for a specific n and
arbitrary x. Starting with input x, the largest number
obtainable with k multiplications is derived by squaring
x, squaring the result, etc., that is $x^{2^k}$. Thus we must
have

$$x^{2^k} \ge x^n ,$$

hence

$$k \ge \log_2 n ,$$

and so at least $\log_2 n$ multiplications are required to
compute $x^n$. Proving that, in some sense, this minimum can
be achieved is more difficult. The following theorem, due
to Brauer [4], is also proved in Knuth [30, §4.6.3,
Theorem D]; Val'skii [67] studied the problem
independently, arriving at the same result.

Theorem A: Let m(n) be the minimum number of
multiplications required to compute $x^n$ for given values of
x; then

$$\lim_{n \to \infty} \frac{m(n)}{\lfloor \log_2 n \rfloor} = 1 .$$
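
As an illustration of the bound $k \ge \log_2 n$, the following is a minimal Python sketch of the familiar binary (square-and-multiply) method, with a counter for multiplications. The function name `power_with_count` is chosen here for illustration and does not come from the sources cited. The binary method is not always optimal, but it never uses fewer than $\lceil \log_2 n \rceil$ nor more than $2 \lfloor \log_2 n \rfloor$ multiplications.

```python
def power_with_count(x, n):
    """Compute x**n (n >= 1) by the binary method.

    Returns (value, number_of_multiplications).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = None   # nothing accumulated yet
    base = x        # holds x**(2**j) after j squarings
    mults = 0
    while True:
        if n & 1:
            if result is None:
                result = base            # first factor costs nothing
            else:
                result = result * base
                mults += 1
        n >>= 1
        if n == 0:
            break
        base = base * base               # one squaring per remaining bit
        mults += 1
    return result, mults
```

For a power of two such as n = 16 the count is exactly $\log_2 16 = 4$; for n = 15 the binary method uses 6 multiplications, although 5 are known to suffice.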


This problem readily generalizes to: What is
the minimal number of arithmetic operations needed to
evaluate the nth degree polynomial

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n , \qquad a_n \ne 0 , \qquad (1)$$

for given values of x? When it is known a priori that the
values of x given will be equally spaced, a method using
finite differences might be most convenient; however, we
will assume that the x's will be arbitrary. The answer
then usually given to this question is to use Horner's
method:

$$f_0 = a_n ,$$
$$f_{r+1} = x f_r + a_{n-r-1} , \qquad r = 0, 1, \ldots, n-1 ,$$

which requires n additions and n multiplications. Can
this be improved? One can easily find specific polynomials
for which there is a better method; for example

$$f(x) = 1 + x + 2x^2 + x^3 + x^4 ,$$

which can be evaluated in two multiplications and two
additions by proceeding as follows:

$$x^2 , \quad x^2 + 1 , \quad (x^2 + 1)((x^2 + 1) + x) .$$
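
The two evaluation strategies just described can be sketched in Python as follows. This is only an illustration: the names `horner` and `special_quartic` are chosen here, and the operation counters are bookkeeping added for the comparison.

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's method.

    coeffs = [a_0, a_1, ..., a_n].
    Returns (value, multiplications, additions).
    """
    n = len(coeffs) - 1
    f = coeffs[n]                        # f_0 = a_n
    mults = adds = 0
    for r in range(n):
        f = x * f + coeffs[n - r - 1]    # f_{r+1} = x f_r + a_{n-r-1}
        mults += 1
        adds += 1
    return f, mults, adds


def special_quartic(x):
    """Evaluate 1 + x + 2x^2 + x^3 + x^4 with 2 multiplications, 2 additions."""
    s = x * x            # multiplication 1: x^2
    t = s + 1            # addition 1: x^2 + 1
    return t * (t + x)   # addition 2, multiplication 2
```

For x = 3, `horner([1, 1, 2, 1, 1], 3)` returns the value 130 using four multiplications and four additions, while `special_quartic(3)` reaches the same value with two of each.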

However, we want a more general scheme, one which
will work for all polynomials and for all values of x. To
state the problems of optimality in a precise framework,
we define a scheme as follows: Let $\circ$ denote any of the
arithmetic operations addition, subtraction, multiplication
or division. A scheme for the evaluation of the polynomial
(1) will be defined as the sequence of operations

$$p_i = Q_i \circ R_i , \qquad i = 1, 2, \ldots, m , \qquad (2)$$

where each $Q_i$ and $R_i$ is either the variable x, or $a_k$ where
$0 \le k \le n$, or a constant independent of
$x, a_0, a_1, \ldots, a_n$, or $p_j$ where j < i; and furthermore,

$$p_m = f(x)$$

for all $x, a_0, a_1, \ldots, a_n$. Clearly Horner's method is a
valid scheme for polynomial evaluation, and in fact,
Horner's method is optimal, in the sense that it requires
the fewest operations necessary in this type of scheme, for
we have:

Theorem B: Any scheme of type (2) which evaluates
all nth degree polynomials has at least n additions/
subtractions and n multiplications/divisions.

In this theorem, the necessity of the n
additions/subtractions was shown by Belaga [2], [3]
(see Knuth [30, §4.6.4, Theorem A]) while the necessity
of the n multiplications/divisions is due to Pan [45].

A different approach can be taken if a large
number of values P(x) are required. Consider the following
example, due to Todd [65]. We want to evaluate the
polynomial

$$P = x^6 + A x^5 + B x^4 + C x^3 + D x^2 + E x + F .$$

Define the following polynomials

$$p_1 = x^2 + a x = x(a + x)$$
$$p_2 = (p_1 + x + b)(p_1 + c)$$
$$p_3 = (p_2 + d)(p_1 + e)$$

and determine a, b, c, d, e, and f such that $P = p_3 + f$. This
can be done by the solution of linear equations and a
single quadratic equation. Once these equations have been
solved, P can be evaluated using only three multiplications
and seven additions using the sequence

$$p_1 , \quad p_2 , \quad p_3 , \quad P = p_3 + f , \qquad (3)$$

a savings of three multiplications at the expense of one
addition and some "preconditioning" of the coefficients.
Since multiplication is usually much slower than addition
and since we sometimes want the same polynomial evaluated
at many arbitrarily spaced points, the above method can
represent a significant improvement over Horner's method;
for example Pan [42] gives preconditioned coefficients
for polynomial approximations to standard functions which
can be used in writing efficient subroutines for those
functions.
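
A small Python sketch may make the operation count of sequence (3) concrete. It works in the easy direction only: given preconditioned values a, ..., f (arbitrary here, not derived from any particular A, ..., F), it expands $p_3 + f$ into ordinary coefficients and checks that the three-multiplication evaluation agrees with that expansion. Solving for a, ..., f from given A, ..., F, which requires the quadratic equation mentioned above, is not attempted. All function names are chosen for this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two polynomials given as coefficient lists, lowest degree first."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def expand(a, b, c, d, e, f):
    """Coefficients [F, E, D, C, B, A, 1] of P = p3 + f."""
    p1 = [0, a, 1]                                        # x^2 + a x
    p2 = poly_mul(poly_add(p1, [b, 1]), poly_add(p1, [c]))
    p3 = poly_mul(poly_add(p2, [d]), poly_add(p1, [e]))
    return poly_add(p3, [f])


def evaluate_preconditioned(a, b, c, d, e, f, x):
    """Evaluate P at x with 3 multiplications and 7 additions."""
    p1 = x * (a + x)                # 1 multiplication, 1 addition
    p2 = (p1 + x + b) * (p1 + c)    # 1 multiplication, 3 additions
    p3 = (p2 + d) * (p1 + e)        # 1 multiplication, 2 additions
    return p3 + f                   # 1 addition
```

The expansion is always a monic polynomial of degree six, which is why six free parameters a, ..., f are enough to match the six coefficients A, ..., F.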

A scheme with preconditioning is formally defined
as scheme (2) in which the $Q_i$ and $R_i$ are in addition
allowed to be any real functions of the coefficients of
the polynomial to be evaluated. The scheme above, (3), is
an example of such preconditioning. The idea of
preconditioning is due to Motzkin [35] and he showed that if a
scheme with preconditioning computes all nth degree
polynomials then it contains at least $\lfloor (n+1)/2 \rfloor$
multiplications (see Knuth [30, §4.6.4, Theorem M]). Combining
this result with the full strength of the result of Belaga
mentioned above, we have

Theorem C: Any scheme with preconditioning
which evaluates all nth degree polynomials has at least
$\lfloor (n+1)/2 \rfloor$ multiplications and at least n additions/
subtractions.

Considerable work has been done to find a scheme
with preconditioning which attains the lower bounds of
Theorem C. Early papers by Motzkin [35] and Knuth [29]
gave methods which evaluate polynomials of degrees four,
five, and six in $\lfloor (n+1)/2 \rfloor + 1$ multiplications and
n + 1 additions; Pan [41], [44] has given similar methods for
n < 12. In each of these cases, the methods are applicable
only for a particular value of n. Pan [41] gives a method
valid for n > 2 which requires $\lfloor (n+1)/2 \rfloor + 1$
multiplications and n + 2 additions/subtractions and in [40] he
gives a method for n > 5 for which $\lfloor n/2 \rfloor + 2$
multiplications and n + 1 additions/subtractions are needed. For
n > 3, Knuth [29] gives a method using n + 1 additions/
subtractions in which the number of multiplications varies
between $\lfloor (n+1)/2 \rfloor + 1$ and approximately a fixed
fraction of n. Belaga [2]
proves that $\lfloor (n+1)/2 \rfloor + 1$ multiplications and n + 1
additions suffice to evaluate any nth degree polynomial,
but these operations may involve complex numbers. Finally,
Eve [10] modified Knuth's method to give a method requiring
$\lfloor n/2 \rfloor + 2$ multiplications and n additions/subtractions,
all of which involve only real numbers.

The best general algorithm for polynomial evaluation
(Eve's) requires only the minimum number of additions/
subtractions; unfortunately, it requires one more
than the minimum number of multiplications. It is known
[30, §4.6.4, exercise 33] that when n is odd both of
these lower bounds cannot be simultaneously achieved, and
a similar result holds when n = 4 and n = 6 [30, §4.6.4,
exercises 35 and 36]. There is no known general algorithm
using $\lfloor n/2 \rfloor + 1$ multiplications when n > 8, although
such methods are known for n = 4, 6, and 8, where the algorithms
require one or two extra addition/subtraction operations
[30], [42].

The above optimality theorems show that no one
method will work for all polynomials of degree n unless it
has a certain minimum number of operations, but there are
some "special" polynomials which can be evaluated far more
rapidly; for example, $a \cdot x^m$ for a suitable exponent m
(such as m = 16) requires only five multiplications
and no additions, instead of the minimums given by
Theorems B and C. There are "few" such polynomials, for
Belaga [3] has shown

Theorem D: The set of nth degree polynomials
which can be evaluated by schemes with preconditioning in
fewer operations than specified in Theorem C has Lebesgue
measure zero in the space of all nth degree polynomials.

Pan  [43]  has  proved  a  similar  result  for  schemes 
without  preconditioning. 

Most  of  these  results  have  been  generalized  to 
polynomials  of  many  variables  and  to  rational  functions. 
In  particular,  such  results  are  given  in  [2],  [36],  [39]  , 
and  [43]  .  Some  of  the  results  have  also  been  obtained  for 
the  problem  of  simultaneously  evaluating  several  poly- 
nomials in  the  same  variable  [45]  and  for  the  specific  case 
of  the  simultaneous  evaluation  of  a  polynomial  and  its 
first  derivative  [30],  [37]  . 

1.2.2   Linear  Algebraic  Problems 

By generalizing the notions in the above results
to arbitrary fields, Winograd proved some very elegant
theorems, first announced in [71] and [74], and finally
published in [75]. Let F be a field and let $x_1, x_2, \ldots, x_n$
be a set of indeterminates (variables). The question then
becomes, what is the minimum number of field operations
needed to compute the m field elements

$$\psi_i \in F(x_1, \ldots, x_n) , \qquad i = 1, \ldots, m .$$

Winograd gave a very general definition of a scheme without
preconditioning, and considered only the number of
multiplications/divisions. He showed:

Theorem E: Let $\Phi$ be an m by n matrix whose
elements are in the field F, let $\phi$ be an m vector of elements
in F, and let x denote the n column vector $(x_1, \ldots, x_n)$ so
that

$$\Phi x + \phi \in F(x_1, \ldots, x_n)^m .$$

If there are u column vectors in $\Phi$ such that no nontrivial
linear combination of them (over F) is in $F^m$, then any
scheme, without preconditioning, computing $\Phi x + \phi$ requires
at least u multiplications/divisions.

Pan's result on the number of multiplications
needed for polynomial evaluation without preconditioning
(part of Theorem B) follows from this theorem as a
corollary, for here $\Phi$ is the 1 by n + 1 matrix

$$(1, x, x^2, \ldots, x^n)$$

and the columns of $\Phi$ are all linearly independent so that
u = n.
We also have another corollary:

Theorem F: Let X be a p by q matrix and let y be
a q column vector. Then to compute X y requires at least
pq multiplications/divisions and so the ordinary method
of computing X y minimizes the number of multiplications/
divisions.

This follows from Theorem E by defining

$$\Phi_{ij} = \begin{cases} y_k & \text{if } j = (i-1)q + k , \quad 1 \le k \le q \\ 0 & \text{otherwise} \end{cases}$$

and letting $x = (x_{11}, \ldots, x_{1q}, x_{21}, \ldots, x_{2q}, \ldots, x_{pq})$.

Winograd  similarly  generalized  Motzkin's  result 
on  the  number  of  multiplications  when  preconditioning  is 
allowed  (part  of  Theorem  C) : 

Theorem G: Let $\Phi$, $\phi$, x and F be as in Theorem E.
If there are u column vectors in $\Phi$ such that no nontrivial
linear combination of them (over F) is in $F^m$, then any
scheme with preconditioning computing $\Phi x + \phi$ requires at
least $\lfloor (u+1)/2 \rfloor$ multiplications/divisions.

Motzkin's  result  follows  from  this  exactly  as 
Pan's  followed  from  Theorem  E,  and  we  have  a  corollary 
similar  to  Theorem  F: 

Theorem  H:   Let  X  and  y  be  as  in  Theorem  F,  then 
every  algorithm  for  computing  X  y  requires  at  least  pq/2 
multiplications/divisions  which  do  not  depend  only  on  the 
entries  of  X  or  only  on  the  entries  of  y. 
Floyd [13] has studied some of these problems,
and  others  dealing  with  the  computation  of  quadratic  forms, 
from  a  linear  algebraic  point  of  view.   Among  other  results 
he  gives  easy  proofs,  using  only  basic  linear  algebra,  of 
Theorem  F  restricted  to  the  case  p  =  1  and  Theorem  H 
restricted  to  the  cases  p  =  1  and  p  =  q. 

Winograd [73], [75] shows the possibility of
approaching the lower bound given in Theorem H, by giving
an algorithm which uses preconditioning to compute X y in
$p \lceil q/2 \rceil + \lfloor q/2 \rfloor$ multiplications; this algorithm
then leads to an algorithm to multiply two n by n matrices in
$n^2 \lceil n/2 \rceil + 2n \lfloor n/2 \rfloor$, or approximately $n^3/2$,
multiplications; Waksman [68] modified this algorithm and
reduced the number of multiplications by an additional n/2.
Winograd's result is somewhat surprising since the usual
method of matrix multiplication, that is by the definition,
requires $n^3$ multiplications, and it had not been thought
that this could be diminished.
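
The count $p \lceil q/2 \rceil + \lfloor q/2 \rfloor$ can be illustrated with a short Python sketch of the pairing identity usually associated with Winograd's inner product method, $(x_1 + y_2)(x_2 + y_1) - x_1 x_2 - y_1 y_2 = x_1 y_1 + x_2 y_2$, which relies on commutativity. The function names and the bookkeeping are assumptions made for this illustration; in particular, the per-row products that depend only on X are treated as preconditioning and left uncounted, which is what makes the tally match the figure quoted above.

```python
def pair_products(v):
    """v[0]*v[1] + v[2]*v[3] + ... over complete adjacent pairs."""
    return sum(v[2 * j] * v[2 * j + 1] for j in range(len(v) // 2))


def winograd_matvec(X, y):
    """Compute X y; returns (result, counted_multiplications)."""
    q = len(y)
    half = q // 2
    eta = pair_products(y)        # floor(q/2) multiplications, shared by all rows
    mults = half
    result = []
    for row in X:
        xi = pair_products(row)   # depends only on X: preconditioning, not counted
        s = 0
        for j in range(half):
            s += (row[2 * j] + y[2 * j + 1]) * (row[2 * j + 1] + y[2 * j])
            mults += 1
        if q % 2:                 # leftover term when q is odd
            s += row[q - 1] * y[q - 1]
            mults += 1
        result.append(s - xi - eta)
    return result, mults
```

For a 2 by 3 matrix this gives $2 \cdot 2 + 1 = 5$ counted multiplications instead of the 6 used by the ordinary method.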

This work was followed very closely by an
astonishing result of Strassen [64], who showed that two
n by n matrices could be multiplied using only
$4.7\, n^{\log_2 7}$, or about $4.7\, n^{2.81}$, arithmetic operations.
Strassen's method is based on a clever trick by which he can
multiply 2 by 2 matrices using only seven multiplications
(instead of eight) and eighteen additions; since this trick
does not make use of the commutativity of multiplication, the
method generalizes to higher order matrices by decomposing
them into blocks. Strassen goes on to apply his methods
to matrix inversion, computing the determinant, and solving
linear systems of equations and he shows that each of
these can be done in $c \cdot n^{\log_2 7}$ arithmetic operations,
provided certain submatrices are nonsingular.
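
The seven-multiplication trick can be written out explicitly. The Python sketch below uses the commonly published form of Strassen's formulas, which is assumed here to correspond to the construction in [64]; the names `m1` through `m7` are conventional labels. Each product keeps its factor from A on the left and its factor from B on the right, so the same formulas remain valid when the entries are themselves matrix blocks.

```python
def strassen_2x2(A, B):
    """Multiply two 2 by 2 matrices with 7 multiplications and 18 additions."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # Seven products (10 additions/subtractions inside the factors).
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombination (8 additions/subtractions).
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return [[c11, c12], [c21, c22]]
```

Applying the scheme recursively to n by n matrices split into four blocks gives the recurrence behind the exponent $\log_2 7$: each level replaces one product of size n by seven products of size n/2.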

It is not, in general, known whether or not
Strassen's method is optimal. Hopcroft and Kerr [20] have
worked on this problem and they give a generalization of
his method for multiplying m by 2 times 2 by n matrices
which requires $\lceil (3m+1)n/2 \rceil$ multiplications. They then
show that this number of multiplications is minimal for the
cases n = m = 3 and m = 2, n arbitrary; the optimality of
Strassen's method for 2 by 2 matrices follows immediately
from their results.

Several  years  prior  to  the  work  of  Strassen  and 
Winograd, Kljuev and Kokovkin-Scerbak [26] had approached
the  problem  of  the  solution  of  an  n  by  n  linear  system  in 
a  different  manner.   Using  a  detailed  examination  of  the 
number  and  placement  of  zeroes  in  the  matrix,  they  proved 

Theorem  I :   If  only  operations  on  entire  rows 

1  2 

are  permitted  then  — n(n-l) (2n-l)  +  n  additions/subtrac- 

o 

1    2 
tions  and  ^rn(n  +3n-l)  multiplications/divisions  are  re- 
quired to  solve  an  n  by  n  system  of  linear  equations. 

Since  these  are  exactly  the  numbers  of  operations 
required  by  Gaussian  elimination,  we  have  as  a  corollary 
that Gaussian elimination is optimal, when one is
restricted to operating on entire rows. Strassen's method
is faster, but it uses operations on submatrices rather
than on rows.

In [27], Kljuev and Kokovkin-Scerbak extended
their  previous  work  and  by  similar  methods  established 
lower  bounds  for  the  number  of  arithmetic  operations  re- 
quired to  transform  a  given  matrix,  not  necessarily  square, 
into  another  given  matrix  of  like  rank  and  dimensions. 
Again,  they  restricted  themselves  to  the  case  where  only 
operations  on  entire  rows  are  allowed.   They  showed 

Theorem J: Let $F_k$ denote the m by n matrix in
which there are $t_i$ zeros to the right of the main diagonal
in the ith row, $q_i$ zeros below the main diagonal in the
ith column, and ones in the (i,i) positions, for
$1 \le i \le k$; moreover, the $t_{i+1}$ zeros of the (i+1)st row
are in the same columns as the $t_i$ zeros, the $q_{i+1}$ zeros of
the (i+1)st column are in the same rows as the $q_i$ zeros, and
the remaining elements are arbitrary. Then, when one is
restricted to operations on entire rows, the number of
multiplications/divisions required to transform an m by n
matrix A of rank m to an $F_k$ of rank m is

$$\sum_{i=1}^{k} \left[ (n-i) q_i + (m-i)(t_i + 1) - q_i t_i \right] .$$

1.2.3 Basic Arithmetic Operations

So far, we have been concerned only with how
many operations need to be performed, independent of the
time required for individual operations. The usual method
for adding/subtracting two n digit numbers requires time
proportional to n and the usual method of multiplication
requires time proportional to $n^2$. Can either of these
methods be improved upon? For addition/subtraction no
substantial improvement is possible since the usual time
is about n "cycles", and there are 2n inputs (digits) while
on each cycle one can use at most two of the inputs.
Multiplication, however, can be done more quickly than by the
usual method. Karatsuba and Ofman [22] developed a method
which requires time proportional to
$n^{\log_2 3} \approx n^{1.585}$; Toom [66] generalized this
algorithm and proved

Theorem K: For all $\epsilon > 0$ there is a constant
$c(\epsilon)$ and a multiplication algorithm such that the time
required to multiply two n digit numbers is less than
$c(\epsilon)\, n^{1+\epsilon}$.

Schönhage [50] devised a different algorithm
which also required at most time $c \cdot n^{1+\epsilon}$ and he
showed that his algorithm could be implemented in time
proportional to $n\, 2^{\sqrt{2 \log_2 n}} (\log_2 n)^{3/2}$. Cook [5]
showed how to adapt Toom's method so that it could be
implemented in time proportional to $n\, 2^{c \sqrt{\log_2 n}}$
for a constant c. A good discussion of all
these results can be found in Knuth [30, §4.3.3].
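
The idea behind the $n^{\log_2 3}$ bound is that a product of two numbers split into halves can be formed from three half-size products rather than four. The following Python sketch, written for decimal digits, illustrates that recursion. It is an illustration under assumptions rather than the algorithm as published in [22]: the name `karatsuba`, the choice of base 10, and the convention of counting each base case (one operand below 10) as a single multiplication are all made here.

```python
def karatsuba(x, y):
    """Multiply non-negative integers by three-way splitting.

    Returns (product, number_of_base_case_multiplications).
    """
    if x < 10 or y < 10:
        return x * y, 1                    # base case: one operand is a single digit
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)           # x = x_hi * base + x_lo
    y_hi, y_lo = divmod(y, base)
    hi, c1 = karatsuba(x_hi, y_hi)
    lo, c2 = karatsuba(x_lo, y_lo)
    mid, c3 = karatsuba(x_hi + x_lo, y_hi + y_lo)
    cross = mid - hi - lo                  # equals x_hi*y_lo + x_lo*y_hi
    return hi * base * base + cross * base + lo, c1 + c2 + c3
```

For two 2 digit numbers the sketch uses three single-digit-level products where the schoolbook method uses four; repeating the saving at every level of the recursion is what lowers the exponent from 2 to $\log_2 3$.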
Unfortunately, there is still a gap between the
obvious lower bound for multiplication, time proportional
to n, and the best known algorithms, time proportional to
$n\, \alpha^{\beta \sqrt{\log n}}$ for appropriate $\alpha$ and $\beta$. The only
non-trivial minimality result is due to Cook and Aanderaa [6]
where they prove

Theorem L: On a bounded activity machine, a
"super" Turing machine, multiplication cannot be performed
in less than time proportional to

$$\frac{n \log n}{(\log \log n)^2} .$$

Ofman [38] introduced the idea of studying the
delay time of a circuit and the number of elements in a
circuit which computes some given function, say addition.
Toom [66] extended this work to multiplication and Winograd
[69], [70] and Spira [56], [57], [58], [59], [60], [61]
derived lower bounds on the number of delays and elements
required for various operations. Elementary discussions
of some of this material appear in [1] and [72]. Most
of this work is not germane to the present discussion
since it deals with circuitry rather than computational
schemes of a programmable nature, and we will not pursue it
further.

1.2.4   Sorting,  Maximum  and  Median 

Problems of sorting a finite set, determining
the maximum of a set, and in general determining the kth
element in a set of n > k elements arise in computer
science almost as much as problems involving the evaluation
of polynomials. These problems have arisen in classical
mathematics as problems in tournament elimination systems,
where there are n contestants of unequal ability and the
dominance relation between pairs of contestants is
transitive. Here we are allowed only pairwise comparisons and
the questions of interest are: What is the minimal number
of such comparisons required to completely rank the
contestants? To determine the best contestant? To determine
the median contestant? What is the minimum expectation of
the number of comparisons? Moon [34, §16] contains a
discussion of some of these and other questions.

Theorem M: Let M(n) denote the minimum number
of comparisons needed to rank n objects according to some
transitive relation. Then

$$1 + \lfloor \log_2 (n!) \rfloor \le M(n) \le 1 + n \lfloor \log_2 n \rfloor .$$

The upper bound follows from some of the better
algorithms for sorting, which usually require
$1 + n \lfloor \log_2 n \rfloor$ comparisons; this type of method was
first devised by Steinhaus [62], in the form of the binary
insertion sort, long before the advent of computer sorting.
The lower bound is derived from the now well-known information
theoretic approach. This method appears to have been
discovered independently by several authors, Steinhaus [63]
first, then Ford and Johnson [15], and finally Hoare [18].
There has been some refinement on the upper bound,
particularly by Ford and Johnson [15]; they developed a
method which requires $1 + \lfloor \log_2 (n!) \rfloor$ comparisons for
$n \le 11$ but requires extra comparisons beyond that.
Steinhaus [63] has conjectured that the lower bound in
Theorem M is always attainable; the first unsettled case
is n = 12.
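
The gap between the information theoretic lower bound and binary insertion can be tabulated with a few lines of Python. This is a sketch under stated assumptions: the lower bound is computed as $\lceil \log_2 (n!) \rceil$, which agrees with $1 + \lfloor \log_2 (n!) \rfloor$ whenever n! is not a power of two (that is, for $n \ge 3$), and the binary insertion count is the standard worst case $\sum_{k=2}^{n} \lceil \log_2 k \rceil$ rather than the closed-form upper bound of Theorem M. Both function names are chosen here.

```python
import math


def information_bound(n):
    """ceil(log2(n!)): comparisons needed to distinguish n! orderings."""
    return (math.factorial(n) - 1).bit_length()


def binary_insertion_comparisons(n):
    """Worst-case comparisons of binary insertion: sum of ceil(log2 k), k = 2..n."""
    return sum((k - 1).bit_length() for k in range(2, n + 1))
```

For n = 5 the two functions give 7 and 8 respectively, and for n = 12 they give 29 and 33, so binary insertion alone does not reach the lower bound in either case.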

Kislicyn [23], [24] took a somewhat different
approach  to  the  problem  and  he  proved 

Theorem  N:   An  asymptotic  lower  bound  on  the 
expected  number  of  comparisons  used  to  rank  n  elements  is 


j^r  (n-tn(n)-n)  +  n-F(    "   -   -  1)  +  0(ln(n)) 


where 


w  *  _   l-tn(l+x)    2£x-tn(2(l+x))] 
FU)  "   ln2)    "      1+x 


(2) 

Suppose that there are n contestants and we want
to know which is the kth best in a ranking; Sobel [55]
has termed this the "selection problem". On the other
hand, if we wish to know the k best and their ranking,
Sobel [54] calls this the "ordering problem". When k = 1
and k = 2, the ordering and selection problems coincide
and a minimal solution for one is also a minimal solution
for the other. The case k = 1 is trivial and a simple
induction argument shows that n - 1 comparisons are
required. For k = 2, the problem of determining the minimal
number of comparisons was first posed by Steinhaus in 1929 and
it was solved by Schreier [51]; later, a more elegant
solution was given by Slupecki [52]. They showed that at
least $n - 1 + \lfloor \log_2 (n-1) \rfloor$ comparisons are required.
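
That this number of comparisons is also sufficient for k = 2 can be seen from the usual knockout tournament argument: the second best contestant must have lost directly to the champion, so only the champion's defeated opponents need to be compared afterwards. The Python sketch below illustrates this. It is not taken from [51] or [52]; the name `second_largest` and the pairing order are choices made here, and the inputs are assumed to be distinct.

```python
def second_largest(items):
    """Return (second largest item, comparisons used); needs len(items) >= 2."""
    comparisons = 0
    # Each player is (value, list of values it has beaten directly).
    players = [(v, []) for v in items]
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players) - 1, 2):
            (a, beaten_a), (b, beaten_b) = players[i], players[i + 1]
            comparisons += 1
            if a > b:
                beaten_a.append(b)
                next_round.append((a, beaten_a))
            else:
                beaten_b.append(a)
                next_round.append((b, beaten_b))
        if len(players) % 2:
            next_round.append(players[-1])   # odd player out advances unopposed
        players = next_round
    _, losers = players[0]                   # opponents beaten by the champion
    best = losers[0]
    for v in losers[1:]:
        comparisons += 1
        if v > best:
            best = v
    return best, comparisons
```

The tournament uses n - 1 comparisons and the champion meets at most $\lceil \log_2 n \rceil$ opponents, so the total never exceeds $n - 1 + \lfloor \log_2 (n-1) \rfloor$; for n = 8 it is exactly 9.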
Kislicyn [25] then showed that the kth best could always
be found in at most

$$n - 1 + \sum_{i=1}^{k-1} \lfloor \log_2 (n-i) \rfloor \qquad \text{for } k \le \frac{n+1}{2}$$

comparisons. Summing this up, we have

Theorem O: Let $V_k(n)$ denote the minimum number
of comparisons needed to find the kth largest element from
a set of n different elements. Then

$$V_k(n) \le n - 1 + \sum_{i=1}^{k-1} \lfloor \log_2 (n-i) \rfloor , \qquad k \le \frac{n+1}{2} ,$$

and equality holds for k = 1 and k = 2.

Sobel [54], [55], Hadian [16], and Hadian and
Sobel [17] consider various properties of some algorithms
which solve the ordering and selection problems. Among
the results they obtain are asymptotic bounds for the
expected number of comparisons.


The selection problem when $k = \lceil n/2 \rceil$ is especially
interesting since this is the computation of the median of
a set of numbers. Floyd [14] has devised an algorithm in
which the expected number of comparisons is $\frac{3}{2} n$ and he
shows that no algorithm can have a lower expectation. It
is still an open problem whether the median can be computed
in fewer than a number of pairwise comparisons proportional
to $n \log n$.

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Pohl  £46j  approached  the  selection  problem  from 
an  entirely  different  angle,  that  of  minimizing  the  amount 
of  storage  needed  to  determine  the  k  '  largest  element. 
He  showed  that  at  least  min(k,n-k+l)  storage  locations 
are  required. 


1.2.5   Searching  Problems 

The problem of determining if a given element is
in a given finite set is generally known as a search
problem, and it arises frequently in computational problems.
There is a well known optimal algorithm for searching an
ordered list of n elements, namely the binary search, which
seems to have originated with Steinhaus [62]. This method
requires $\lceil \log_2 n \rceil$ binary comparisons and this is optimal
by a simple information theoretic argument. This algorithm
is also optimal in an expectation sense, for Sandelius [49]
has shown that the minimum possible expected number of
comparisons is $\lceil \log_2 n \rceil - e(n)$, where $e(n) = 0$ if n is a
power of two and $1 > e(n) > 0$ otherwise.
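
The worst-case count for binary search can be checked with a short sketch. It assumes that the key is known to be one of the n list entries, so that there are exactly n possible outcomes, and it uses only two-way comparisons; the name `locate` is hypothetical.

```python
def locate(sorted_items, key):
    """Return (index of key, comparisons used); key is assumed to be present."""
    lo, hi = 0, len(sorted_items) - 1
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if key <= sorted_items[mid]:
            hi = mid          # key lies in the left part, which has ceil(m/2) entries
        else:
            lo = mid + 1      # key lies in the right part, which has floor(m/2) entries
    return lo, comparisons
```

Each comparison at most halves (rounding up) the number of candidate positions, so no more than $\lceil \log_2 n \rceil$ comparisons are made.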

A somewhat related problem is the discovery of
the single counterfeit coin, either heavier or lighter, in
a group of n coins; this problem is well known in the
literature of recreational mathematics. One is usually
allowed to use a balance scale and hence the comparisons
are of linear functions, over {-1, 0, 1}, of the inputs
rather than just pairwise comparisons. The only minimality
result known for this problem is due to Smith [53], who
has shown that when $n = (3^k - 1)/2$, k such comparisons are
necessary and sufficient.


1.2.6   Integral  Equations 

Emel'yanov and Il'in [9] approached an optimality
question from an unusual point of view. The problem is
to approximate the solution to a Fredholm integral equation
of the second kind:

$$y(P) = \int_D K(P,Q)\, y(Q)\, dQ + f(P)$$

where D is an m-dimensional region. Emel'yanov and
Il'in established a lower bound on the minimum number of
operations required to approximate the solution to within
a prescribed accuracy. They also showed that a known
method operates within this bound.


CHAPTER  2 
New  Results 

2.1   The  Model 

Algorithms are represented as finite trees,
called tree programs, where every node of the tree has
one of the forms

i)   Replacement    a ← b
ii)  Function       a ← f(b,c)
iii) Comparison     a:b

For deterministic algorithms, replacement and function
nodes have degree one (only one subtree), while comparison
nodes have degree either two (equal and not equal
subtrees) or three (less than, equal, and greater than
subtrees); in non-deterministic algorithms the degrees
are arbitrary. Since this paper deals only with deterministic
algorithms, the word "deterministic" is omitted
throughout.

For example, a tree program which determines
whether two vectors x = (x1, x2) and y = (y1, y2) are equal
might look like:

[Figure: tree program whose comparison nodes include x2:y2, with terminal nodes NO, YES, and NO.]

and a tree program which solves the quadratic equation
$ax^2 + bx + c = 0$ might be:

[Figure: tree program with comparison nodes a:0, c:0, and d:0, function nodes e ← 4·a·c and d ← d - e, and terminal nodes "SOLUTION IS -c/b", "EVERY X IS A SOLUTION", "INCONSISTENT, NO SOLUTION", "TWO COMPLEX ROOTS (-b ± i√|d|)/(2a)", "DOUBLE ROOT AT -b/(2a)", and "TWO REAL ROOTS (-b ± √d)/(2a)".]

This model has been chosen because it seems to
capture the essentials of information flow, thus making
it easy to get bounds on the number of comparisons by
simply counting the total number of terminal nodes in the
tree. The model also allows facts about the functions in
type (ii) nodes to be used, while previously known
information theoretic arguments [15], [18], and [63] can easily
be put into this framework by disallowing nodes of type
(ii).
In general, there are two different types of
function nodes: those involving functions which can be
used  in  comparisons,  and  those  involving  functions  which 
are  not  allowed  to  be  used  in  comparisons.   For  example, 
we  may  want  to  use  multiplication  in  "bookkeeping"  but 
not  allow  comparisons  between  products  of  the  inputs.   The 
reason  for  such  a  restriction  is  that  some  problems  become 
much  simpler,  by  orders  of  magnitude,  when  "complicated" 
functions  of  the  inputs  are  permitted  in  comparisons. 
These  ideas  are  made  precise  as  follows: 

Let $F = \{f_1, f_2, \ldots, f_m\}$ be a set of functions,
$f_i : R \times R \to R$. Define $F^1 = F$ and

$$F^k = F^{k-1} \cup \{\, f(g(\,), h(\,)) \mid f \in F;\ g, h \in F^{k-1} \,\};$$

then define

$$F^* = \bigcup_{k=1}^{\infty} F^k.$$

Thus F* is the set of all functions which
can be constructed from F using finitely many compositions.
The set of tree programs over the sets of functions F and
G, denoted T(F,G), is defined as all possible finite tree
programs in which all of the function nodes are functions
from F ∪ G and the comparison nodes consist only of
comparisons between actual inputs, constants, and functions
from G* applied to inputs and constants. Functions from
F* - G* applied to inputs cannot be used in comparisons;
furthermore, any substitutions or sequences of computations
which effect such comparisons are also prohibited.

A set of functions F will be called ignorant if
$\max(a,b) \notin F^*$; that is, if the result of the comparison
a:b cannot be deduced by using a finite number of
compositions of functions from F. For example, the set
{+, -, ·, /, sign} is not ignorant since

$$\max(a,b) = \frac{(1 + \operatorname{sign}(a-b)) \cdot a + (1 - \operatorname{sign}(a-b)) \cdot b}{2};$$

while the set {+, -, ·} is ignorant since every function
in {+, -, ·}* must clearly be everywhere differentiable
in all variables, while max(x,y) is not differentiable on
the line x = y and hence $\max(a,b) \notin \{+, -, \cdot\}^*$.
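
The identity for max in terms of sign can be checked numerically. In this small sketch `sign` is treated as a primitive function of the set; its Python definition here necessarily uses comparisons and is only a stand-in. The name `max_via_sign` is hypothetical.

```python
def sign(x):
    """Stand-in for the primitive sign function: -1, 0, or 1."""
    return (x > 0) - (x < 0)


def max_via_sign(a, b):
    """max(a, b) written as a composition of +, -, *, / and sign only."""
    s = sign(a - b)
    return ((1 + s) * a + (1 - s) * b) / 2
```

When a > b the second term vanishes and the result is 2a/2; when a < b the first term vanishes; when a = b both terms contribute and the result is (a + b)/2 = a.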

In general when working with the tree programs
in T(F,G), we will want F ∪ G to be ignorant so that
"information theoretic arguments" will make sense. For
example, if F were not ignorant, then the maximum of a set
of numbers could be computed using no comparisons, which
is intuitively wrong! Since we are usually interested in
the case where F ∪ G ⊆ {+, -, ·, /, ↑}, we show:

Theorem 1: F = {+, -, ·, /, ↑} is ignorant.

Proof: The proof is similar to the argument
which shows that {+, -, ·} is ignorant, but here we are
bothered with discontinuities caused by "/" and the fact
that with negative numbers and rational powers, "↑" takes
us into the complex plane. Consider the set $\mathcal{F}$ of the
extensions of the functions in F to the complex plane,
with the point at infinity. Clearly all functions in $\mathcal{F}$
are analytic, while no complex extension of z = max(x,y)
could possibly be analytic since it is not differentiable
on the line x = y = z in the real numbers. Hence F is
ignorant. Q.E.D.

Certain classes of tree programs are especially
interesting, and will occur frequently enough in the
discussion to warrant giving them special names. We define

$\mathcal{S} = T(\{+, -, \cdot, /, \uparrow\}, \emptyset)$ = Simple tree programs,

$\mathcal{L} = T(\{+, -, \cdot, /, \uparrow\}, \{+, -\})$ = Linear tree programs,

$\mathcal{P} = T(\{+, -, \cdot, /, \uparrow\}, \{+, -, \cdot\})$ = Polynomial tree programs,

$\mathcal{R} = T(\{+, -, \cdot, /, \uparrow\}, \{+, -, \cdot, /\})$ = Rational tree programs,

and $\mathcal{E} = T(\{+, -, \cdot, /, \uparrow\}, \{+, -, \cdot, /, \uparrow\})$ = Exponential tree programs.

We now introduce three functions T(F,G) → N which
in some sense measure the complexity of a tree program in
T(F,G). Let t ∈ T(F,G), then

H(t) = height of the tree; that is, the length
       of the longest path from the root to a
       terminal node,

C(t) = the largest number of comparisons found
       along any path from the root to a
       terminal node,

and

W(t) = the width of the tree; that is, the number
       of terminal nodes.

We have the following relationship between H, C, and W:

$$H(t) \ge C(t) \ge \log_3 W(t). \tag{3}$$
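
The three measures can be computed mechanically once a tree program is held as a data structure. The sketch below is one possible representation, not the thesis's own; the class `Node` and the helper `vector_equality_tree` are hypothetical names. Because a comparison node has at most three subtrees, the check uses the base 3 logarithm of the width.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                  # "compare", "function", "replace", or "leaf"
    label: str = ""
    children: list = field(default_factory=list)


def height(t):
    """H(t): number of edges on the longest root-to-leaf path."""
    return 0 if not t.children else 1 + max(height(c) for c in t.children)


def comparisons(t):
    """C(t): largest number of comparison nodes on any root-to-leaf path."""
    here = 1 if t.kind == "compare" else 0
    return here + max((comparisons(c) for c in t.children), default=0)


def width(t):
    """W(t): number of terminal nodes."""
    return 1 if not t.children else sum(width(c) for c in t.children)


def vector_equality_tree(n, i=1):
    """Element-by-element equality test of two n-vectors with (=, not =) nodes."""
    if i > n:
        return Node("leaf", "YES")
    return Node("compare", f"x{i}:y{i}",
                [vector_equality_tree(n, i + 1), Node("leaf", "NO")])
```

For the element-by-element equality test on vectors of dimension 4 this gives H = 4, C = 4, and W = 5.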



Since  the  trees  are  finite,  looping  is  precluded, 
and  so  a  tree  program  can  work  only  for  a  particular  class 
of  inputs,  for  example,  only  for  vectors  of  dimension  n, 
for  a  fixed  value  of  n.   The  optimality  results  still  have 
valid  and  useful  interpretations  for  "real"  programming 
languages,  with  loops,  because  given  a  particular  input, 
we can "unfold" the resultant program to get a tree
structure allowed by the model.

2.2   Set Equality and Related Problems

In  this  section  we  consider  some  algorithms  for 
testing  vector  equality,  set  equality,  set  containment, 
emptiness  of  intersection,  and  several  similar  problems. 
Some of these results were originally presented in a
weaker form by the author in [48].

Proposition:   For  any  F  and  G,  there  is  a  tree 
program  of  height  n,  in  T(F,G)  which  determines  whether  or 
not  two  n-dimensional  vectors  are  equal;  furthermore, 
that  height  is  minimal. 

Proof :   The  algorithm  is  the  obvious  one,  element 
by  element  comparison.   Minimality  is  established  by  noting 
that  at  every  node  at  most  two  of  the  inputs  can  be  used 
in  a  computation,  so  if  the  tree  had  height  n  -  1,  or  less, 
the  output  could  not  be  a  function  of  all  2n  inputs.  Q.E.D. 

The optimality proof in the above proposition is
very elementary. It merely asserts that the algorithm
must look at each input before the correct output can be
given, a fact so obvious in this case it hardly needs
mentioning. In general, however, the proofs will be
nowhere near as trivial. Many of the proofs of minimal
height will be given by showing that O(W(t)) = O(f(n))
for an input of dimension n and hence using (3) we have
O(H(t)) ≥ O(log f(n)). Such proofs tell us nothing new
about the height when O(log f(n)) ≤ O(n), for as in the
proof of the above proposition, when there are n inputs,
we must have H(t) ≥ n in order to "read" all the inputs.
These arguments are still worthwhile in that they give
lower bounds for W(t) and C(t), each of which is a valid
measure of the complexity of the algorithm.

We  will  now  consider  the  problem  of  determining 
whether  two  given  sets  are  equal.   Since,  by  assumption, 
the  sets  are  finite  sets  of  integers,  the  most  obvious 
algorithm  is  to  order  the  sets  by  using  some  sorting 
procedure  and  then  to  compare  element  by  element;  when 
the sorting procedure used is O(n·log n), and when the
only operations allowed in comparisons are addition and
subtraction, this method is optimal.

Lemma: There is a linear tree program of height
O(n·log n) which sorts vectors of length n, for any given
n.

Proof: The treesort of Floyd [12] requires
O(n·log n) operations. This algorithm can easily be put
into the form of a tree program for any given n. Q.E.D.
Theorem 2: Given two sets A and B, there is a
linear tree program of height O(n·log n), where
n = max(|A|, |B|), which determines whether or not A = B;
furthermore, the order of magnitude of that height is
minimal in $\mathcal{L}$.

Proof: Without loss of generality we can assume
that |A| = |B| since the dimensions are given in the input
and it requires only one comparison to test and answer "no"
if |A| ≠ |B|. Moreover, we can consider A and B to be
vectors x and y (respectively) and then the question "is
A = B?" is equivalent to the question "is x a permutation
of y?" This can be answered in O(n·log n) steps by sorting
both x and y using an O(n·log n) sorting algorithm and then
doing an element by element comparison.
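
The upper bound can be illustrated with a sketch that counts comparisons explicitly. Merge sort is used here in place of Floyd's treesort purely for brevity, which is a substitution made for this example; the names `merge_sort` and `sets_equal` are hypothetical, and the inputs are assumed to be lists without repeated elements.

```python
def merge_sort(items, counter):
    """Sort items, adding the number of element comparisons to counter[0]."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def sets_equal(a, b):
    """Return (A == B, comparisons used), for lists a and b without repeats."""
    counter = [1]                       # the single test of |A| against |B|
    if len(a) != len(b):
        return False, counter[0]
    x, y = merge_sort(a, counter), merge_sort(b, counter)
    for u, v in zip(x, y):              # element by element comparison
        counter[0] += 1
        if u != v:
            return False, counter[0]
    return True, counter[0]
```

With merge sort each of the two sorts uses at most $n \lceil \log_2 n \rceil$ comparisons, so the total is at most $2n \lceil \log_2 n \rceil + n + 1$.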

As to the minimality, we will actually show
something much stronger than stated in the theorem, namely
that the width is at least O(n!); then using the inequality
(1) and Stirling's formula we conclude that not only is
the height at least O(n·log n), but the number of comparisons
is also O(n·log n).

The minimality of O(n!) as the width is proved
by showing that each permutation must cause the algorithm
to terminate at a different leaf of the tree. Suppose
that P is a tree program which determines whether A = B,
and further suppose that two different permutations $\pi_1$ and
$\pi_2$ cause P to terminate at the same leaf of the tree.
That is, on input

$$x = (1, 2, 3, \ldots, n), \qquad y = \pi_1(1, 2, 3, \ldots, n) \tag{4}$$

the tree program terminates at the same leaf as on input

$$x = (1, 2, 3, \ldots, n), \qquad y' = \pi_2(1, 2, 3, \ldots, n) \tag{5}$$

and $\pi_1 \ne \pi_2$.

Since $\pi_1 \ne \pi_2$ there are integers i and j such
that $\pi_1(i) \ne \pi_2(i)$ and $\pi_1(j) \ne \pi_2(j)$. Hence the path
which P follows on inputs (4) and (5) cannot possibly
contain any of the comparisons

$$b_i : a_{\pi_1(i)}, \quad b_i : a_{\pi_2(i)}, \quad b_j : a_{\pi_1(j)}, \quad \text{or} \quad b_j : a_{\pi_2(j)} \tag{6}$$

on an input a, b, because these comparisons give different
answers on input (4) than on input (5). Furthermore, if
there is a sequence of comparisons

$$x_{i_1} : y_{j_1}, \quad x_{i_2} : y_{j_2}, \quad \ldots, \quad x_{i_k} : y_{j_k}$$

from which the result of any of the comparisons in (6) can
be uniformly deduced (that is, in all cases, regardless
of the values), then that sequence could not have occurred.
For example, suppose that

$$x_{i_1} : y_{j_1}, \quad x_{i_2} : y_{j_2}, \quad \ldots, \quad x_{i_k} : y_{j_k} \implies b_i : a_{\pi_2(i)};$$

the results of the comparisons in the implicand are the
same for $\pi_1$ and $\pi_2$ (or else the program would have
ended on different terminal nodes for $\pi_1$ and $\pi_2$), and
hence the result implied for $b_i : a_{\pi_2(i)}$ must be the same
for both $\pi_1$ and $\pi_2$. Since the actual comparison for
$\pi_1$ and $\pi_2$ yields different answers, the implication is
wrong for one of $\pi_1$ or $\pi_2$.
Consider the input

$$x = (k, 2k, 3k, \ldots, n \cdot k), \qquad y = \pi_1(t_1, \ldots, t_n), \tag{7}$$

where

$$t_\ell = \begin{cases} \ell \cdot k & \text{if } \ell \ne i \text{ and } \ell \ne j, \\ u & \text{if } \ell = i, \\ v & \text{if } \ell = j, \end{cases}$$

and k, u, and v are variables. The sequence of comparisons
made by P on the path followed on inputs (4) and (5) can
be represented as

$$f_1 : c_1, \quad f_2 : c_2, \quad \ldots, \quad f_r : c_r$$

where each $f_q$ is some linear functional of the inputs, and
each $c_q$ is a constant. In the case of input (7) these
reduce to

$$\alpha_1 u + \beta_1 v : c_1 + \gamma_1 k, \quad \ldots, \quad \alpha_r u + \beta_r v : c_r + \gamma_r k \tag{8}$$

where $\alpha_q$, $\beta_q$, and $\gamma_q$ are constants determined by the
linear functionals. The specific results of the comparisons
on inputs (4) and (5) give a sequence of linear
inequalities which is consistent when k = 1 by the
argument in the previous paragraph.
If the area of solution of the inequalities (8)
is infinite when k = 1, then clearly we have many positive
integer solutions for u and v, of which only two are such
that x is a permutation of y; yet P must, by construction,
terminate at the same leaf of the tree on all of these,
and hence P must make an error. The area defined by the
inequalities (8) is non-zero when k = 1, because the two
inputs, (4) and (5), give two solutions; thus if the area
is finite, it consists of a convex polygon containing at
least two different points. As k grows this polygon
increases proportionately in area, and sooner or later for
some k, this region of solution must contain at least three
integer solutions for u and v, of which only two are such
that x is a permutation of y; again, P must terminate at
the same leaf of the tree in all cases and hence make an
error. Contradiction; hence for every different permutation,
P must terminate at a different leaf, thus there
are at least n! leaves in P and so P must have height at
least $O(\log_3 n!)$ = O(n·log n). Q.E.D.

As a corollary to Theorem 2 we get a generalization
of the well known result in [15], [18], and [63]:

Corollary: The optimal sorting algorithm in $\mathcal{L}$
requires O(n·log n) operations.

Proof: If sorting could be done more quickly,
then the question "is A = B?" could be answered in fewer
than O(n·log n) operations, contradicting Theorem 2. Q.E.D.
Unfortunately, this method of proof does not
generalize to tree programs from $\mathcal{P}$, $\mathcal{R}$, or $\mathcal{E}$. For
example, the area defined by the inequalities

y-x2>0-k=0 

x  >  2k 

y  <  k 
is non-zero when k = 1; however, for k > 4, the area is
zero.

There  are  many  instances  in  which  we  want  to 
test  the  equality  of  two  sets,  but  the  members  of  the  sets 
cannot  be  sorted,  for  example,  if  the  members  of  the  sets 
were  sets,  trees,  geometric  shapes,  etc.   In  these  cases 
it  may  be  difficult  (or  impossible)  to  impose  a  linear 
ordering,  a  prerequisite  to  sorting,  and  thus  we  are 
interested  in  the  case  where  tree  programs  are  not  allowed 
to  sort.   One  interpretation  of  this  is  that  comparisons 
are  not  allowed  between  two  elements  from  the  same  set; 
since  the  operations   {  +,  -}  usually  go  with  a  linear 
ordering,  we  exclude  them  also.   We  have: 

Theorem 3: If we restrict Theorem 2 to simple
tree programs and we do not allow comparisons between two
elements of A or between two elements of B, then the best
tree program has height $O(n^2)$, where n = max(|A|, |B|).

Proof: The proof will, in effect, construct an
input for which any correct simple tree program will have
to go through its longest path to reach the answer that
A = B, and we will see that the length of this path must
be at least n(n+1)/2, thus proving the theorem.

Let P be a tree program in which there are no
comparisons of the form $a_i : a_j$ or $b_i : b_j$. Now consider the
following path through the tree program:

a)  On a comparison with a constant, take the
    branch where the constant is less than the
    other operand.

b)  On a comparison $a_i : b_j$, take the branch where
    $a_i > b_j$, unless this branch gives an immediate
    answer of A ≠ B. If that happens, take
    the branch where $a_i = b_j$.

Claim: The first place we reach a node where we
must choose $a_t = b_s$ will be no higher than at the nth
comparison which involves either $a_t$ or $b_s$.

Proof of claim: When we reach that node, assume
that we have compared only elements from the sets

$$C = \{a_{i_1}, \ldots, a_{i_k}\} \subseteq A = \{a_1, \ldots, a_n\}$$

and

$$D = \{b_{j_1}, \ldots, b_{j_\ell}\} \subseteq B = \{b_1, \ldots, b_n\}.$$

We first show that $\ell \ge n - k$. Suppose that $\ell < n - k$,
i.e. $\ell \le n - k - 1$. Then we might have each $a_i$ in C equal
to one of the $b_j$ in B - D, and each $a_i$ in A - C equal to
one of the $b_j$ in D. Hence we could not have been forced
to make $a_t$ equal to $b_s$; thus $\ell \ge n - k$. Secondly, we must
have compared $a_t$ with every $b_j$ in D, and $b_s$ with every $a_i$
in C, or else a similar argument shows that we were not
yet forced to make $a_t$ equal to $b_s$. Thus when we do reach
a node where we must make $a_t$ equal to $b_s$, we must have made
a total of $\ell + k \ge (n - k) + k = n$ comparisons. Q.E.D.
claim.
Let C(n) be the minimum number of comparisons
between $a_i$'s and $b_j$'s for a simple tree program which
correctly determines whether A = B. Notice that after
discovering that $a_t = b_s$ we have at least C(n-1) comparisons
left, for we can set $a_t = b_s$ = constant through the
program, thus changing at least n comparisons from type (b)
to type (a). We have left a program which works on sets
each having one less element.

Thus

$$C(n) \ge n + C(n - 1) \tag{9}$$

and clearly

$$C(1) \ge 1. \tag{10}$$

Together, (9) and (10) imply that

$$C(n) \ge n(n+1)/2.$$

Therefore P has height at least $O(n^2)$. Q.E.D.

Another interpretation of the no-sorting condition
is to allow only equal and not equal comparisons,
since without a linear ordering, greater than and less than
make no sense.

Theorem 4: If we restrict Theorem 2 to simple
tree programs and we allow only (=, ≠) comparisons and
do not allow (<, =, >) comparisons, then the best tree
program has height $O(n^2)$, where n = max(|A|, |B|).

Proof: Let P be a simple tree program which
determines whether A = B, and in which all comparisons
are (=, ≠). Define a new simple tree program, P',
constructed from P as follows: At every node in P of the
form $a_i : a_j$ or $b_i : b_j$ delete that node and its "=" subtree,
leaving only the "≠" subtree. From the construction of
P' it is clear that if no $a_i = a_j$ and no $b_i = b_j$ then P'
correctly determines whether A = B.

Notice that Theorem 3 and its proof are also
correct when the input sets A and B are not allowed to
have repeated elements, that is $a_i = a_j$ or $b_i = b_j$ if and
only if i = j. From this strengthened version of Theorem
3, we see that the height of P' is at least $O(n^2)$, and by
construction, the height of P is greater than or equal to
the height of P'. Thus the height of P is at least $O(n^2)$.
Q.E.D.

From Theorems 3 and 4, we readily conclude that
the algorithm which makes all possible n(n+1)/2 comparisons
is the best we can do.
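
A minimal sketch of such an exhaustive algorithm follows. It uses only (=, ≠) tests between an element of A and an element of B, and it assumes, as in the strengthened form of Theorem 3, that neither input has repeated elements. The name `sets_equal_by_matching` is hypothetical.

```python
def sets_equal_by_matching(a, b):
    """Return (A == B, comparisons used), never comparing two elements of the same set."""
    if len(a) != len(b):
        return False, 0
    remaining = list(b)                 # elements of B not yet matched
    comparisons = 0
    for x in a:
        for idx, y in enumerate(remaining):
            comparisons += 1
            if x == y:
                del remaining[idx]      # x is matched; shrink the problem by one
                break
        else:
            return False, comparisons   # x equals no remaining element of B
    return True, comparisons
```

In the worst case the match for each element of A is found last, giving n + (n-1) + ... + 1 = n(n+1)/2 comparisons, which agrees with the lower bound obtained in the proof of Theorem 3.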

We have the following corollaries to Theorems 2,
3, and 4:

Corollary 1: (a) Given two sets A and B, there
is a linear tree program of height O(n·log n), where
n = max(|A|, |B|), which determines whether or not A ⊆ B;
furthermore, the order of magnitude of that height is
minimal for $\mathcal{L}$. (b) If we restrict (a) to simple tree
programs, and either comparisons between members of the
same set are prohibited or only (=, ≠) comparisons are
allowed, then the best tree program has height O(|A|·|B|).

Proof: The algorithms are obvious. In all
three cases, minimality of the order of magnitude is
established by noting that we can test A = B by testing
A ⊆ B and B ⊆ A. Q.E.D.

Corollary 2: (a) Given two sets A and B, there
is a linear tree program of height O(n·log n), where
n = max(|A|, |B|), which computes |A ∩ B|; furthermore,
that height is minimal for $\mathcal{L}$. (b) If we restrict (a)
to simple tree programs, and either comparisons between
members of the same set are prohibited or only (=, ≠)
comparisons are allowed, then the best tree program has
height O(|A|·|B|).

Proof: For both (a) and (b) the algorithms are
obvious. Since |A ∩ B| = |A| if and only if A = B, the
orders of magnitude of the heights are minimal by Theorems
2, 3, and 4. Q.E.D.

We want to ask how hard it is to "construct" a
set. To this end we augment the possible nodes of tree
programs with "output(t)", a node of degree one which
causes the value of the variable t to be recorded. We say
that a tree program constructs a set S if at the end of
executing the tree program, exactly the members of S have
been recorded by the executed output nodes.

Corollary 3: (a) Given two sets A and B, there
is a linear tree program of height O(n·log n), where
n = max(|A|, |B|), which constructs A ∩ B; furthermore,
the order of magnitude of that height is minimal for $\mathcal{L}$.
(b) If we restrict (a) to simple tree programs, and either
comparisons between members of the same set are prohibited
or only (=, ≠) comparisons are allowed, then the best tree
program has height O(|A|·|B|).

Proof: The algorithm is again obvious. Given
a tree program which constructs A ∩ B, modify it to
initially set a counter to zero and add one to that counter
in place of every output node; then at the end, this
counter contains |A ∩ B| and minimality is established by
Corollary 2. Q.E.D.
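
The chain of reductions used in Corollaries 2 and 3 can be sketched as follows. Python's built-in `sorted` stands in for the O(n·log n) sorting tree program, the callback `output` plays the role of an output node, and all three function names are hypothetical; inputs are assumed to have no repeated elements.

```python
def construct_intersection(a, b, output):
    """Call output(t) once for every member t of the intersection of a and b."""
    x, y = sorted(a), sorted(b)
    i = j = 0
    while i < len(x) and j < len(y):
        if x[i] == y[j]:
            output(x[i])                # corresponds to an output(t) node
            i += 1
            j += 1
        elif x[i] < y[j]:
            i += 1
        else:
            j += 1


def intersection_size(a, b):
    """Replace every output node by 'add one to a counter' (proof of Corollary 3)."""
    count = 0

    def bump(_):
        nonlocal count
        count += 1

    construct_intersection(a, b, bump)
    return count


def sets_equal(a, b):
    """A = B exactly when |A| = |B| and the intersection has |A| members."""
    return len(a) == len(b) and intersection_size(a, b) == len(a)
```

Any program that constructs the intersection therefore yields a program of no greater height that computes its size, and in turn one that decides A = B.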

Theorem 5: (a) Given two sets A and B, there is
a linear tree program of height O(n·log n), where
n = max(|A|, |B|), which determines whether or not A ∩ B = ∅;
furthermore, the order of magnitude of that height is
minimal for $\mathcal{L}$. (b) If we restrict (a) to simple tree
programs, and either comparisons between members of the
same set are prohibited or only (=, ≠) comparisons are
allowed, then the best tree program has height O(|A|·|B|).

Proof: (a) The algorithm is the usual one of
sorting both A and B and comparing them. We only outline
a proof that O(n·log n) is minimal. Given a set C where
$c_i = c_j$ if and only if i = j we use an algorithm which
determines whether or not A ∩ B = ∅ on the inputs

$$C' = \{\, 10 c_i \mid c_i \in C \,\}$$

$$C'' = \{\, 10 c_i + 1 \mid c_i \in C \,\}.$$

Before deciding that C' ∩ C'' = ∅, the algorithm must know
that $c'_i \ne c''_j$ for every pair i,j. The algorithm can be
modified to know whether $c'_i < c''_j$ or $c'_i > c''_j$ for every
pair i,j without increasing its height; this means that
at the time it decides that C' ∩ C'' = ∅, it must also have
enough information to sort C', that is, to sort C. Thus,
by a slightly stronger form of the corollary to Theorem 2,
which is proved by observing that we can strengthen
Theorem 2 by considering as inputs only sets A such that
$a_i = a_j$ if and only if i = j, we know that the height is
at least O(n·log n). (b) The proof is very similar to
that of Theorems 3 and 4 and it will not be given. Q.E.D.

The problems of "what is A - B?", "what is
|A - B|?", and "is A - B = ∅?" are handled very much like
those in the above theorems and corollaries.

2.3   The Maximum and Median of a Set

The following result considers the minimal width,
W(t), of a tree program instead of the number of comparisons
(as in Theorem 0) or the number of storage locations (as
in [46]):
Theorem 6: Given a set of n integers A, any
simple tree program which determines the i-th greatest
element of A has width at least $i \binom{n}{i}$.

Proof: The crux of the proof is that to find
the i-th greatest element in A, we must know for every
element in A whether it is greater than or less than the i-th
element. In other words, the only permutations of the set
of inputs {1, 2, ..., n} which can cause the tree program
to terminate at the same node are those which separately
permute the elements above the i-th and below the i-th,
without intermixing.

Suppose that on the input
(1, 2, ..., n)
the tree program terminated at the same node as on the
input
$\pi$(1, 2, ..., n),
where $\pi$ interchanges an element t, greater than the i-th
element (n - i + 1), with an element less than n - i + 1.
Clearly, at the node at which the tree program terminated
on these inputs, it cannot be known or deduced that
t > n - i + 1; the most which can be known or deduced is
that t is less than some numbers above n - i + 1 and is
greater than some numbers below n - i + 1. Hence by
interchanging t and n - i + 1, we can cause the tree program to
follow the same path and thus to make the error of choosing
t as the i-th element, instead of n - i + 1. Hence
comparisons alone cannot give the i-th element; since we
are dealing with simple tree programs and {+, -, ·, /, ↑}
is ignorant, the i-th element cannot be calculated by
function evaluations. Thus the only permutations which
can terminate at a common node are those which do not
intermix the "large" and "small" elements.

There are (n - i)!(i - 1)! permutations which
separately permute the upper and lower elements. Thus,
since there are a total of n! permutations and at most
(n - i)!(i - 1)! can cause the tree program to terminate
at a single node, there must be at least

\[ \frac{n!}{(n-i)!\,(i-1)!} = i \binom{n}{i} \]

different terminal nodes, so that $W(t) \ge i\binom{n}{i}$. Q.E.D.
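The counting bound just derived is easy to evaluate numerically. The following is a minimal sketch; the function names `width_lower_bound` and `width_lower_bound_binomial` are hypothetical names chosen here, not taken from the text. It computes n!/((n - i)!(i - 1)!) directly and checks it against the equivalent form i·C(n, i).

```python
from math import comb, factorial

def width_lower_bound(n, i):
    """Number of terminal nodes forced by the counting argument:
    n! / ((n - i)! * (i - 1)!)."""
    return factorial(n) // (factorial(n - i) * factorial(i - 1))

def width_lower_bound_binomial(n, i):
    """The same quantity written as i * C(n, i)."""
    return i * comb(n, i)

if __name__ == "__main__":
    n = 8
    # i = 1 is the maximum; i = ceil(n / 2) is the median.
    for i in (1, (n + 1) // 2):
        print(i, width_lower_bound(n, i), width_lower_bound_binomial(n, i))
```

For n = 8 the bound is 8 for the maximum (i = 1) and 280 for the median (i = 4).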

When i is a constant, we have
$W(t) \ge i\binom{n}{i} = O(n^i)$, and so computing the maximum (i = 1)
requires O(n) width. On the other hand, in computing
the median we have $i = \lceil n/2 \rceil$ and hence

\[ W(t) \ge \left\lceil \frac{n}{2} \right\rceil \binom{n}{\lceil n/2 \rceil} = O(2^n) \]

by Stirling's formula. Unfortunately, in each case
O(log W(t)) is less than or equal to O(n) and so the implied
restriction on the height given by this theorem is also
given by a far simpler argument (cf. the proof of the
proposition in the previous section); however, the result
in Theorem 6 is of interest since it shows something
nontrivial about the width of the tree program.

It seems unlikely that the lower bounds on the
width given by Theorem 6 can be attained by simple or even
linear tree programs. These lower bounds can be attained,
however, in the case of the maximum by exponential tree
programs and, for the median, by polynomial tree programs:

Theorem 7: Over exponential tree programs the
maximum of a set of n integers can be computed with width
n; that is, by a tree program with $C(t) = \lceil \log_2 n \rceil$.

Proof: Let $\{x_1, x_2, \ldots, x_n\}$ be the input set
and without loss of generality assume that n is a power of 2. Notice
that

\[ (n+1)^{x_1} + \cdots + (n+1)^{x_{n/2}} > (n+1)^{x_{n/2+1}} + \cdots + (n+1)^{x_n} \tag{11} \]

if and only if the maximum occurs in $\{x_1, \ldots, x_{n/2}\}$.
This gives a binary algorithm for the maximum which
requires a height proportional to n and a width of exactly
n. Q.E.D.
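The binary search in this proof can be sketched as follows. This is an illustration only, assuming distinct integer inputs; the name `max_by_exponential_comparisons` is hypothetical, and `Fraction` is used here purely so that negative exponents are handled exactly. When n is not a power of 2 the sketch splits the candidates unevenly, which the argument tolerates because the base n + 1 exceeds the number of terms on either side.

```python
from fractions import Fraction

def max_by_exponential_comparisons(xs):
    """Locate the maximum of distinct integers xs (non-empty) by
    repeatedly comparing sums of powers of (n + 1), as in comparison (11).
    Returns (maximum, number_of_comparisons)."""
    n = len(xs)
    base = Fraction(n + 1)
    lo, hi = 0, n                 # the maximum lies in xs[lo:hi]
    comparisons = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left = sum(base ** x for x in xs[lo:mid])
        right = sum(base ** x for x in xs[mid:hi])
        comparisons += 1
        if left > right:          # maximum is in the left half
            hi = mid
        else:                     # maximum is in the right half
            lo = mid
    return xs[lo], comparisons

if __name__ == "__main__":
    print(max_by_exponential_comparisons([3, -1, 7, 0, 5, 2, 9, 4]))
```

With eight inputs the sketch makes three comparisons, one for each halving of the candidate set.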

Notice that as stated and proved, Theorem 7 is
restricted to sets of integers and not sets of real numbers.
This result can be generalized to sets of real numbers if
we assume that no element of a set is repeated, that is
$x_i = x_j$ if and only if i = j. Define


d  =  2  Y    (x.-x.)2 


i>j   ±   j      (xi-xj)2 




and observe that d has the property

\[ \frac{1}{d} < \frac{1}{2}\,|x_i - x_j| \quad \text{for all } i \ne j. \]

The comparison

\[ (2n)^{d x_1} + \cdots + (2n)^{d x_{n/2}} \;:\; (2n)^{d x_{n/2+1}} + \cdots + (2n)^{d x_n}, \]

instead of the comparison indicated in (11), achieves the
same purpose, thus extending Theorem 7.

Theorem 8: Over polynomial tree programs the
median of a set of n different real numbers can be
computed by a tree program with C(t) < 2n.

Proof: Let $\{x_1, \ldots, x_n\}$ be the input, where
$x_i = x_j$ if and only if i = j. Compute
$p_k = \prod_{i \ne k} (x_k - x_i)$ for k = 1, 2, ..., n. Notice that
exactly one-half of the $p_k$'s will be negative and one-half
will be positive. Since the median is $x_{\lceil n/2 \rceil}$, the sign of
$p_{\lceil n/2 \rceil}$ is a function only of n, say f(n). Clearly then, if

\[ \operatorname{sign}(p_i) \ne \operatorname{sign}(p_{\lceil n/2 \rceil}) = f(n), \]

$x_i$ cannot be the median. Since exactly half of the $p_k$'s
will differ in sign from $p_{\lceil n/2 \rceil}$, we can eliminate half of
the elements from possibly being the median by using n
comparisons to find the signs of the $p_k$'s. We now have
n/2 elements left which might be the median and we can
continue recursively. The number of comparisons made is

\[ C(t) = n + \frac{n}{2} + \frac{n}{4} + \cdots + \frac{n}{2^{\log_2 n - 1}} < 2n. \]

Q.E.D.
Theorem 8 can be easily generalized so as to
compute the i-th element instead of the median.
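The sign-elimination procedure of Theorem 8 can be sketched as below. This is a hedged illustration under stated assumptions: the inputs are distinct, the "median" is taken to be the ⌈n/2⌉-th largest element, and the name `median_by_signs` and the rank bookkeeping are choices made here rather than details from the text. The sign of p_k is negative exactly when an odd number of the remaining elements exceed x_k, so each round keeps the elements whose rank has the same parity as the target rank.

```python
from math import prod

def median_by_signs(xs):
    """Return (median, comparisons) for distinct numbers xs (non-empty),
    where the median is the ceil(n/2)-th largest element. Each round
    compares every product p_k with zero, costing one comparison per
    remaining element."""
    remaining = list(xs)
    rank = (len(remaining) + 1) // 2      # target: rank-th largest
    comparisons = 0
    while len(remaining) > 1:
        # The target has rank - 1 larger elements, which fixes its sign.
        want_negative = (rank - 1) % 2 == 1
        kept = []
        for k, xk in enumerate(remaining):
            p_k = prod(xk - xi for i, xi in enumerate(remaining) if i != k)
            comparisons += 1
            if (p_k < 0) == want_negative:
                kept.append(xk)
        remaining = kept
        rank = (rank + 1) // 2            # rank of the target among survivors
    return remaining[0], comparisons

if __name__ == "__main__":
    print(median_by_signs([7, 1, 5, 3, 8, 2, 6, 4]))
```

For n = 8 the rounds cost 8 + 4 + 2 = 14 comparisons, in line with the bound C(t) < 2n. Exact integer or rational inputs are advisable, since floating-point products can overflow or underflow.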

Lemma 1: There do not exist $(\xi_0, \eta_0)$ and
$(\xi_1, \eta_1)$ such that $\max\{a_1 x + b_1 y + c_1, \ldots, a_n x + b_n y + c_n\}$
occurs at $a_i x + b_i y + c_i$ for $(x, y) = (\xi_0, \eta_0)$ and at
$a_j x + b_j y + c_j$ for $(x, y) = (\xi_1, \eta_1)$, where $i \ne j$, if
and only if $a_i = a_j$ and $b_i = b_j$ for all i and j.

Proof: Elementary casing out of the signs and
relative sizes of $\max\{a_i\}$, $\min\{a_i\}$, $\max\{b_i\}$, $\min\{b_i\}$,
$\max\{c_i\}$, and $\min\{c_i\}$ gives the result. Q.E.D.

Lemma 2: For any n,
$f(x, y) = \max\{a_1 x + b_1 y + c_1, \ldots, a_n x + b_n y + c_n\}$ is in
{+, -, ·, /, ↑} if and only if $a_i = a_j$ and $b_i = b_j$ for
all i and j.

Proof: If $a_i = a_j$ and $b_i = b_j$ for all i and j,
then the maximum is $f(x, y) = a_k x + b_k y + c_k$ for some fixed k and hence
f(x, y) ∈ {+, -, ·, /, ↑}. Conversely, notice that for
some i, the maximum must occur at $a_i x + b_i y + c_i$ for
infinitely many (x, y); furthermore, all such (x, y) lie
in an infinite convex set, C. Suppose that
f(x, y) ∈ {+, -, ·, /, ↑}; then f is clearly analytic
for all (x, y), and C obviously contains an infinite
number of distinct points and their limit point. Thus we
can apply the identity theorem for analytic functions
[28, page 87] and hence we have that $f(x, y) = a_i x + b_i y + c_i$,
contradicting Lemma 1, which says that the maximum can
occur in other places. Hence f(x, y) ∉ {+, -, ·, /, ↑}.
Q.E.D.

Theorem 9: Let P be a linear tree program
which computes the maximum of a set of n numbers; if p is
any path from the root to a leaf for which there is an
input which causes P to follow p, then p contains at least
n - 1 comparison nodes.

Proof: Suppose p contained n - 2 or fewer
comparison nodes. These can be characterized as

\[ \sum_{j=1}^{n} a_{ij} x_j : b_i, \qquad i = 1, 2, \ldots, n-2. \]

Since by assumption there is a set $\{x_1, x_2, \ldots, x_n\}$ for
which p is followed, there must be some vector
$(e_1, e_2, \ldots, e_{n-2})$ such that

\[ \sum_{j=1}^{n} a_{ij} x_j = b_i + e_i, \qquad i = 1, 2, \ldots, n-2 \tag{12} \]

is a consistent set of linear equations.

By standard results of linear algebra [11,
Theorems 5.1 and 5.3], we know that the solutions to the
equations (12) form a subspace of dimension at least two,
that is, constitute at least an entire plane. Clearly
every solution to (12) will also cause P to follow p,
since the results of the comparisons will be unchanged;
hence there is at least an entire plane of points in
n-dimensional space which cause P to follow p.

We can solve (12) for $x_3, x_4, \ldots, x_n$ in
terms of $x_1$ and $x_2$, and we get

\[ x_{i+2} = a_i x_1 + b_i x_2 + c_i, \qquad i = 1, 2, \ldots, n-2. \]

By Lemma 1,

\[ \max(x_1, \ldots, x_n) = \max(x_1, x_2, a_1 x_1 + b_1 x_2 + c_1, \ldots, a_{n-2} x_1 + b_{n-2} x_2 + c_{n-2}) \]

can occur in at least two
different places, since the coefficients are not all the
same. Thus at the end of the sequence of comparisons on
p, it is impossible to determine the maximum unless the
computations yielded the maximum. Let $f(x_1, x_2, \ldots, x_n)$
be the culmination of all of the computation along p, so
that $f(x_1, x_2, \ldots, x_n)$ is the maximum. By definition of
P, f is in {+, -, ·, /, ↑} and by the above argument,

\[ f(x_1, x_2, \ldots, x_n) = f(x_1, x_2, a_1 x_1 + b_1 x_2 + c_1, \ldots, a_{n-2} x_1 + b_{n-2} x_2 + c_{n-2}) = g(x_1, x_2), \]

where g(x_1, x_2) ∈ {+, -, ·, /, ↑},
which contradicts Lemma 2. Thus p must contain at least
n - 1 comparisons. Q.E.D.

The  bound  of  n  -  1  for  path  length  given  in  the 
above  theorem  is  sharp,  because  the  "usual"  algorithm  to 
find  the  maximum  (by  saving  the  largest  element  encountered 
as  the  set  is  searched)  requires  n  -  1  comparisons  along 
every  path. 
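For reference, the "usual" algorithm mentioned above can be sketched as follows; the function name `max_by_scanning` is a hypothetical choice made here. It makes exactly one comparison per element after the first, so every execution path contains n - 1 comparisons.

```python
def max_by_scanning(xs):
    """Return (maximum, comparisons) for a non-empty sequence xs,
    keeping the largest element seen so far."""
    best = xs[0]
    comparisons = 0
    for x in xs[1:]:
        comparisons += 1          # one comparison: x against current best
        if x > best:
            best = x
    return best, comparisons

if __name__ == "__main__":
    print(max_by_scanning([3, 9, 2, 7]))
```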

Corollary:   The  optimal  linear  tree  program  which 
computes  the  maximum  of  a  set  of  n  numbers  has  width  at 


least  2n. 


Proof: Immediate from the theorem. Q.E.D.

2.4 Cycles and Connectivity in Graphs

In this section we show that certain well known
algorithms for detecting cycles and connectivity of graphs
are optimal to within a multiplicative constant, when it
is assumed that the graph is presented as a binary
connection matrix. The results are due to Holt and the
author and were originally presented in [19].

An n by n binary matrix M is said to contain a
cycle if and only if there is a sequence of elements in M
such that

\[ M(k_1, k_2) = M(k_2, k_3) = \cdots = M(k_{m-1}, k_m) = M(k_m, k_1) = 1. \]

Node i is said to be connected to node j in M if and only
if there is a sequence of elements of M such that

\[ M(i, k_1) = M(k_1, k_2) = \cdots = M(k_{m-1}, k_m) = M(k_m, j) = 1. \]

Finally, M is said to be strongly connected if and only if
for every pair of nodes i and j, node i is connected to
node j.

Marimont [32] gives an algorithm to test for the
existence of cycles; when put into tree program form, this
algorithm has a height of $O(n^2)$ for a graph with n nodes.
This order of magnitude is minimal, for we have:

Theorem 10: For all F and G, if a tree program
in T(F, G) determines whether n by n binary matrices
contain a cycle, then that tree program has height at least
n(n - 1)/2.

Proof: Let P be a tree program which determines
whether n by n matrices contain a cycle, and let M be an
n by n binary matrix which does not contain a cycle.
Observe that there are $\binom{n}{2} = n(n-1)/2$ unordered pairs
(i, j) where 1 ≤ i, j ≤ n and i ≠ j. Assume P has height less than
n(n - 1)/2. Then P could not have "used" every one of
the $n^2$ inputs and hence there must be a pair (i, j) such
that P used neither M(i, j) nor M(j, i). Let M' be equal
to M except that M'(i, j) = M'(j, i) = 1; thus M' will
contain a cycle. Clearly, P follows the same path on M'
as on M and hence it must make an error on one or the
other. Contradiction, so P must have height at least
n(n - 1)/2. Q.E.D.

Notice that a slightly higher limit, n(n + 1)/2,
is easily obtained since any correct algorithm must also
use all n diagonal elements of M before concluding that M
contains no cycles.
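The adversary step in the proof of Theorem 10 can be illustrated with a small sketch. The cycle test below is an ordinary depth-first search, used here only as a stand-in (it is not Marimont's algorithm), and the names `has_cycle` and `plant_two_cycle` are hypothetical. Indices are 0-based in the code.

```python
def has_cycle(M):
    """True if the directed graph with adjacency matrix M has a cycle.
    A 1 on the diagonal counts as a cycle of length one."""
    n = len(M)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = [WHITE] * n

    def visit(u):
        colour[u] = GREY                      # u is on the current path
        for v in range(n):
            if M[u][v]:
                if colour[v] == GREY:         # edge back into the path
                    return True
                if colour[v] == WHITE and visit(v):
                    return True
        colour[u] = BLACK
        return False

    return any(colour[u] == WHITE and visit(u) for u in range(n))

def plant_two_cycle(M, i, j):
    """Copy of M with M'(i, j) = M'(j, i) = 1, as in the proof."""
    M2 = [row[:] for row in M]
    M2[i][j] = M2[j][i] = 1
    return M2

if __name__ == "__main__":
    n = 4
    # Strictly upper-triangular matrix: every edge goes "forward", so no cycle.
    M = [[1 if r < c else 0 for c in range(n)] for r in range(n)]
    print(has_cycle(M), has_cycle(plant_two_cycle(M, 1, 3)))
```

Whichever unordered pair (i, j) a program leaves unexamined, planting the two entries turns an acyclic matrix into one with the cycle i, j, i.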

Theorem 11: For arbitrary F and G, if a tree
program in T(F, G) determines whether node i is connected
to node j in n by n binary matrices, then the tree program
has height at least $(n^2 - 1)/4$.

Proof: Let P be a tree program which determines
whether node i is connected to node j. The theorem will
be proved by constructing a matrix which is "difficult"
for P to analyze. Assume that n is even; a similar
construction works when n is odd. We define M as:

\[
\left.
\begin{aligned}
M(k, 2) &= 1 \quad \text{for } k \text{ even}, \\
M(1, \ell) &= 1 \quad \text{for } \ell \text{ odd}, \\
M(k, j) &= 0 \quad \text{otherwise.}
\end{aligned}
\right\} \tag{13}
\]

The graphical interpretation of (13) is illustrated below.

[Figure: graphical interpretation of (13)]

If any odd numbered node becomes connected to any even
numbered node in M, then node 1 will become connected to
node 2. There are n/2 odd numbered nodes and n/2 even
numbered nodes, so the number of elements M(i, j) such
that i is odd and j is even is $n^2/4$. Assume P has height
less than $n^2/4$. In M as defined, node 1 is not connected
to node 2, and this is what P must conclude; however,
since P used fewer than $n^2/4$ operations, there is an
ordered pair (i, j) such that P did not use M(i, j), where
i is odd and j is even. Let M' be equal to M except that
M'(i, j) = 1, so that node 1 is connected to node 2. P
will not look at M'(i, j), and so P must conclude that
node 1 is not connected to node 2 exactly as it concluded
for M. Contradiction; thus P must have height at least
$n^2/4 > (n^2 - 1)/4$ in order to determine whether node i is
connected to node j. Q.E.D.
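The construction (13) and the effect of switching on a single unexamined entry can be checked with the following sketch. It assumes n is even, uses 1-based node numbers by leaving row and column 0 unused, and treats "connected" as reachability along one or more edges; the names `hard_matrix` and `connected` are hypothetical.

```python
from collections import deque

def hard_matrix(n):
    """Matrix (13) for even n, stored 1-indexed (row/column 0 unused)."""
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for k in range(2, n + 1, 2):
        M[k][2] = 1               # every even node points to node 2
    for l in range(1, n + 1, 2):
        M[1][l] = 1               # node 1 points to every odd node
    return M

def connected(M, src, dst):
    """True if dst can be reached from src along one or more edges."""
    n = len(M) - 1
    seen = set()
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in range(1, n + 1):
            if M[u][v] and v not in seen:
                if v == dst:
                    return True
                seen.add(v)
                queue.append(v)
    return False

if __name__ == "__main__":
    M = hard_matrix(6)
    print(connected(M, 1, 2))     # node 1 does not reach node 2 in M
    M[3][4] = 1                   # switch on one odd-to-even entry
    print(connected(M, 1, 2))     # now 1 -> 3 -> 4 -> 2
```

Each of the n²/4 entries M(i, j) with i odd and j even has this effect, which is why a program that skips any one of them can be fooled.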
Dreyfus [8], Dijkstra [7], and Pohl [47] give
$O(n^2)$ algorithms to find the shortest path between two
nodes of a graph. Finding the shortest path requires
determining whether or not such a path exists and thus
these $O(n^2)$ shortest path algorithms are optimal in order
of magnitude.

McCreight [33] gives an $O(n^2)$ algorithm to
determine whether a graph is separable, i.e. if it is not
strongly connected. That this order of magnitude is
optimal is seen by

Theorem 12: For arbitrary F and G, if a tree
program in T(F, G) determines whether an n by n binary
matrix is strongly connected, then the tree program has
height at least $(n^2 - 1)/4$.

Proof: The proof is the same as in the previous
theorem, except that we define M as:

\[
\begin{aligned}
& M(1, 2) = 1, \\
& M(i, i+2) = M(i+2, i) = 1 \quad \text{for } 1 \le i \le n-2, \\
& M(i, j) = 0 \quad \text{otherwise.}
\end{aligned} \tag{14}
\]

The graphical interpretation of (14) is shown below. Q.E.D.
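A sketch of construction (14) follows, again for even n with 1-based node numbers. The reading taken here, which is an interpretation rather than something the text spells out, is that the entries M(i, j) with i even and j odd play the role of the unexamined entries: in M no even node can reach an odd node, and switching on any one such entry makes the matrix strongly connected. The function names are hypothetical.

```python
def separable_matrix(n):
    """Matrix (14), stored 1-indexed (row/column 0 unused)."""
    M = [[0] * (n + 1) for _ in range(n + 1)]
    M[1][2] = 1                               # the only odd-to-even edge
    for i in range(1, n - 1):                 # i = 1, ..., n - 2
        M[i][i + 2] = M[i + 2][i] = 1         # chains within each parity
    return M

def reachable_from(M, src):
    """Set of nodes reachable from src along one or more edges."""
    n = len(M) - 1
    seen, stack = set(), [src]
    while stack:
        u = stack.pop()
        for v in range(1, n + 1):
            if M[u][v] and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def strongly_connected(M):
    """True if every node is connected to every node."""
    n = len(M) - 1
    everything = set(range(1, n + 1))
    return all(reachable_from(M, u) == everything for u in range(1, n + 1))

if __name__ == "__main__":
    M = separable_matrix(6)
    print(strongly_connected(M))
    M[4][5] = 1                               # one even-to-odd entry
    print(strongly_connected(M))
```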


CHAPTER  3 
Open  Problems  and  Conclusions 

We have introduced a fairly simple, realistic
model of algorithms and have used it to facilitate
so-called "information theoretic" arguments. The concept of
the width of a tree program being a measure of the
complexity of the algorithm is very natural and it yields
results not only about the width, but about the height and
the number of comparisons as well. This model allows the
combination of these information theoretic arguments on
the width with linear algebraic arguments; this combination,
as in Theorem 2 and Theorem 9, has yielded a powerful technique
for handling problems in which one is not restricted to
pairwise comparisons between inputs, but one is allowed
instead to compare linear functionals of the inputs. It
is unfortunate that no similar technique exists for
problems in which comparisons are allowed between polynomial,
rational, or exponential functions of the inputs, and
these are now some key open problems. For example:

a)  In  1r   ,  *£ ,  and  (5,  what  is  the  least  width 
required  to  determine  whether  two  sets  are 
equal? 

b)  In   <r  ,  fiu  ,  and  £,  what  is  the  least  width 
required  to  determine  the  median  of  a  set? 

c)  In  f     and  K ,  what  is  the  least  width  re- 
quired to  determine  the  maximum  of  a  set? 
The search for optimal algorithms generally
proceeds along the following lines: One tries to prove
the optimality of an existing algorithm, for example that
it has minimal width; frequently no such proof is
forthcoming and the realization comes that there may be a
better algorithm, not necessarily more efficient or faster,
but one which uses less of whatever resource is being
measured. This is precisely the case with the new
algorithms given here for computing the maximum and the median
of a set; these algorithms are very inefficient, but they
do make the best use of the information available, that
is, they use the fewest comparisons.

The minimal width is very closely associated
with the "amount of information" which is required to
solve a problem; since this is the case, it might have
been hoped that as long as the set of functions is
ignorant, the minimal width remains invariant. Theorems 7
and 9 illustrate that this is not the case. On the other
hand, it is an open question whether or not everything
which can be computed by a tree program in T(F, G) can be
computed by a tree program with width one in T(F ∪ {max}, G),
where F ∪ G is ignorant. If we consider T(F ∪ {sign}, G),
then everything in T(F, G) can be computed by a tree
program with one terminal node.

What about some "hard" graph problems? Can one
get, by tree programs or otherwise, a proof of minimality
for some existing algorithm which tests graph isomorphism?
Graph planarity? These problems seem to be almost
intractable; the most efficient algorithms for determining graph
isomorphism  all  require  on  the  order  of  nj  or  n  operations 
in  the  worst  case,  although  some  do  far  better  than  this 
on various subclasses of inputs. Is this order of
magnitude optimal?

Given a "realistic" model of computation such
as the one used by Pan, Belaga, and others, very little is
known about the minimality of algorithms which compute
f(n), for many functions of mathematical interest. One
can, of course, diagonalize to create functions which
cannot be computed in time less than T(n), for a given T;
but what about functions like the n-th prime, n!, and
others? One of the few interesting results of this type
is

Theorem 13: The n-th Fibonacci number, $F_n$, can
be computed in O(log n) arithmetic operations from
{+, -, ·, /}, and that order of magnitude is minimal.

Proof: We need the fact that $F_n \ge 2^{n/2}$ for all
$n \ge 6$; this is proved by an induction argument. Also by
induction, one can prove that in k operations from
{+, -, ·, /}, the largest number which can be computed
on input n is $n^{2^k}$. Thus if $F_n$ is computed in k such
operations, we have

\[ n^{2^k} \ge F_n \ge 2^{n/2}. \]

Solving this for k we obtain

\[ k \ge \log_2 n - \log_2 2 - \log_2 \log_2 n, \]

and hence

\[ k \ge O(\log n). \]

This establishes the minimality of O(log n) operations.
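The step of solving for k can be written out by taking base-2 logarithms twice. This is a routine expansion of the inequality in the proof, valid since $\log_2 n > 0$ for $n \ge 6$:

```latex
\begin{align*}
n^{2^k} \ge 2^{n/2}
  &\iff 2^k \log_2 n \ge \frac{n}{2} \\
  &\iff 2^k \ge \frac{n}{2 \log_2 n} \\
  &\iff k \ge \log_2 n - \log_2 2 - \log_2 \log_2 n .
\end{align*}
```

Since $\log_2 \log_2 n$ grows more slowly than $\log_2 n$, the right-hand side is of order log n.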

$F_n$ can in fact be computed within this bound by
a binary algorithm given by Knuth [30, solution to problem
26 of §4.6.3]. This algorithm is based on the facts
that given $(F_k, F_{k-1})$ then $(F_{k+1}, F_k) = (F_k + F_{k-1}, F_k)$
and $(F_{2k}, F_{2k-1}) = (F_k^2 + 2F_kF_{k-1},\; F_k^2 + F_{k-1}^2)$. Q.E.D.
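A short sketch built on these two identities is given below. It is an illustration written for this passage, not a transcription of the cited solution, and the names `fib_pair` and `fib` are hypothetical. It assumes the convention $F_0 = 0$, $F_1 = 1$ and halves n at each step, so the number of arithmetic operations is proportional to log n.

```python
def fib_pair(n):
    """Return (F_n, F_{n-1}) for n >= 1, with F_0 = 0 and F_1 = 1."""
    if n == 1:
        return 1, 0
    fk, fk1 = fib_pair(n // 2)            # (F_k, F_{k-1}) with k = n // 2
    f2k = fk * fk + 2 * fk * fk1          # F_{2k}   = F_k^2 + 2 F_k F_{k-1}
    f2k1 = fk * fk + fk1 * fk1            # F_{2k-1} = F_k^2 + F_{k-1}^2
    if n % 2 == 0:
        return f2k, f2k1
    return f2k + f2k1, f2k                # one step up: (F_{2k+1}, F_{2k})

def fib(n):
    """The n-th Fibonacci number for n >= 1."""
    return fib_pair(n)[0]

if __name__ == "__main__":
    print([fib(n) for n in range(1, 11)])
```

The same function can be used to spot-check the size fact used in the lower bound, for example that `fib(n) ** 2 >= 2 ** n` holds from n = 6 onward.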

When the above size argument is applied to $P_n$,
the n-th prime, we find that the lower bound obtained is a
constant, since $P_n = O(n \log n)$; it is, of course,
improbable that the two hundred billionth prime could be
computed as quickly as the tenth. When this size argument
is applied to n!, the lower bound derived is O(log n),
and again, it is hard to believe that n! could be computed
that rapidly. Unfortunately, there are no other available
techniques to handle this type of problem.